How to call an Ollama-based AI text embeddings model from SQL Server 2025

Comments 0

Share to social media

In the first article of this series, I explained how to install and configure Ollama to host text embeddings models locally. I also demonstrated how to install Caddy as a proxy to allow SQL Server to use Ollama via https-based calls. In this article, I’ll show you how to make use of this at the SQL Server end.

No time to read? Here’s a quick overview:
To generate text embeddings in SQL Server using a locally hosted Ollama model, enable REST endpoints with sp_configure, define the model once using CREATE EXTERNAL MODEL, then call AI_GENERATE_EMBEDDINGS whenever you need a vector. This avoids messy REST code every time and keeps your embedding logic clean and reusable.

Which text embeddings models should I use in Ollama?

You’ll find that Ollama can download and run many text embeddings models. There is a trade-off between the size of the models, the number of vector dimensions they return, and their effectiveness.

The models I commonly use are:

Model NameVector Dimensions
all-minilm384
nomic-embed-text768
mxbai-embed-large1024

Even the all-minilm model is surprisingly effective with standard text. The number of dimensions is important, as vector data can quickly become some of the largest data in your databases if you aren’t careful.

How to test the Caddy proxy

We need to start by enabling the ability to call REST-based services:

That allows us to use the sp_invoke_external_rest_endpoint system stored procedure to make REST-based calls.

Now that we have Ollama and Caddy up and running, we can check that SQL Server can actually call them by executing the following code:

You should see a http status code of 200 returned:

However, if you receive an error that looks like a certificate error, you’ll need to make sure you trust the root certificate for Caddy. Open PowerShell and, in the folder where you have downloaded Caddy, execute the following:

.\caddy_windows_amd64.exe environ

Look for the value of the caddy.AppDataDir. On my system, that was C:\Users\Greg\AppData\Roaming\Caddy. In that folder, navigate to the subfolder pki\authorities\local and, in here, you’ll find the Caddy root certificate root.crt.

Using Windows certificate manager (as an administrator), import that certificate into the Certificates subfolder under Trusted Root Certification Authorities. If you have old Caddy certificates, you should remove them first. You’ll then need to restart Caddy to complete the process.

How to retrieve text embeddings via SQL Server REST

Given we can make REST-based calls, we could just directly call the Ollama service via the Caddy proxy by doing this:

If you inspect the returned response, you’ll find the vector for the embeddings.

Fast, reliable and consistent SQL Server development…

…with SQL Toolbelt Essentials. 10 ingeniously simple tools for accelerating development, reducing risk, and standardizing workflows.
Learn more & try for free

Using CREATE EXTERNAL MODEL

Now, while we could then extract the embeddings from that returned JSON, having all this code every time we want to call a model is messy. Instead, SQL Server allows us to define an EXTERNAL MODEL that remembers how to call the model:

The model isn’t a schema-bound object, so we don’t have a schema as part of the name. The LOCATION is just the URL we need to call – or the folder if we’re using an ONNX (Open Neural Network Exchange) runtime.

The API_FORMAT lets SQL Server know what to expect when calling and receiving a response. Currently, it can be one of either Azure OpenAI, OpenAI, Ollama, or ONNX Runtime.

The MODEL_TYPE is curious. At present, it can only be EMBEDDINGS. However, whenever you see a required parameter and only one permitted value, you just know that the SQL Server team are considering other uses for this.

Finally, we specify the model to use, and that’s it. It’s now easier for SQL Server to make calls to the model.

Using AI_GENERATE_EMBEDDINGS

Once the external model is created, we can call it by using the AI_GENERATE_EMBEDDINGS function. That simplifies the code for working with embeddings.

Do bear in mind that, before you can use it, that external AI runtimes must be enabled. Depending on the specific version of SQL Server you’re using, you might also need to enable preview features:

Note that the model that I mentioned here is the name we gave to the external model. I often name the external model based on the text model I’m calling.

Summary and next steps

Creating an external model and then calling it via AI_GENERATE_EMBEDDINGS makes it easy to work with text embeddings in SQL Server. One further aspect to consider is which text to generate the embedding for. SQL Server has provided the AI_GENERATE_CHUNKS function to help us with that. We’ll explore that in the next article.

Simple Talk is brought to you by Redgate Software

Take control of your databases with the trusted Database DevOps solutions provider. Automate with confidence, scale securely, and unlock growth through AI.
Discover how Redgate can help you

FAQs: How to call an Ollama-based AI text embeddings model from SQL Server 2025

1. Which Ollama text embedding model should I use with SQL Server?

Common choices are all-minilm (384 dimensions), nomic-embed-text (768 dimensions), and mxbai-embed-large (1024 dimensions). Even all-minilm is surprisingly effective for standard text, and fewer dimensions help keep vector storage manageable.

2. How do I enable SQL Server to call Ollama via REST?

Run sp_configure 'external rest endpoint enabled', 1 followed by RECONFIGURE. This allows you to use sp_invoke_external_rest_endpoint to make HTTPS calls to Ollama through your Caddy proxy.

3. What is CREATE EXTERNAL MODEL in SQL Server?

CREATE EXTERNAL MODEL defines a reusable reference to an embedding model, storing its URL, API format (Ollama, OpenAI, Azure OpenAI, or ONNX Runtime), and model name, so you don’t have to write REST call code every time.

4. How do I generate embeddings in SQL Server with AI_GENERATE_EMBEDDINGS?

After enabling external AI runtimes and creating an external model, call AI_GENERATE_EMBEDDINGS(N'your text' USE MODEL YourModelName). This returns the vector directly, avoiding the need to parse JSON from raw REST responses.

Article tags

About the author

Dr Greg Low is a member of the Microsoft Regional Director program that Microsoft describe as “150 of the world's top technology visionaries chosen specifically for their proven cross-platform expertise, community leadership, and commitment to business results”. He is the founder and principal consultant at SQL Down Under, a boutique data-related consultancy operating from Australia. Greg is a long-term data platform MVP and a well-known data community leader and public speaker at conferences world-wide. He is known for his pragmatic attitude to business transformation and to solving issues for business of all sizes. Greg is the host of several data-related podcasts: SQL Down Under, Cosmos Down Under, PG Down Under, and Fabric Down Under, and produces the SDU Tools toolset.

Greg's contributions